An a Priori Exponential Tail Bound for k-Folds Cross-Validation
Authors
Abstract
We consider a priori generalization bounds developed in terms of cross-validation estimates and the stability of learners. In particular, we first derive an exponential Efron-Stein type tail inequality for the concentration of a general function of n independent random variables. Next, under some reasonable notion of stability, we use this exponential tail bound to analyze the concentration of the k-fold cross-validation (KFCV) estimate around the true risk of a hypothesis generated by a general learning rule. While the accumulated literature has often attributed this concentration to the bias and variance of the estimator, our bound attributes this concentration to the stability of the learning rule and the number of folds k. This insight raises valid concerns related to the practical use of KFCV, and suggests research directions to obtain reliable empirical estimates of the actual risk.
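For readers unfamiliar with the quantity the abstract analyzes, the following is a minimal sketch of a KFCV estimate: the sample is split into k folds, a hypothesis is learned on the other k-1 folds, and the held-out losses are averaged. The function names (`kfold_indices`, `kfcv_estimate`, `learn_mean`, `squared_loss`), the contiguous unshuffled split, and the toy mean-predictor learner are illustrative assumptions, not the paper's definitions or notation.

```python
from statistics import mean


def kfold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def kfcv_estimate(data, k, learn, loss):
    """Average, over the k folds, of the mean held-out loss of the hypothesis
    learned on the remaining k-1 folds."""
    n = len(data)
    fold_risks = []
    for fold in kfold_indices(n, k):
        held_out = set(fold)
        train = [data[i] for i in range(n) if i not in held_out]
        hypothesis = learn(train)
        fold_risks.append(mean(loss(hypothesis, data[i]) for i in fold))
    return mean(fold_risks)


# Hypothetical learning rule: predict the mean of the training points.
def learn_mean(train):
    return mean(train)


# Hypothetical loss: squared error between prediction and observation.
def squared_loss(hypothesis, z):
    return (hypothesis - z) ** 2


sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
estimate = kfcv_estimate(sample, k=3, learn=learn_mean, loss=squared_loss)
print(estimate)
```

The estimate depends on both the learning rule passed as `learn` and the number of folds `k`, which are the two factors the abstract names as driving its concentration around the true risk.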
Similar resources
A full NT-step O(n) infeasible interior-point method for Cartesian P_*(k)-HLCP over symmetric cones using exponential convexity
In this paper, by using the exponential convexity property of a barrier function, we propose an infeasible interior-point method for the Cartesian P_*(k) horizontal linear complementarity problem over symmetric cones. The method uses Nesterov and Todd full steps, and we prove that the proposed algorithm is well defined. The iteration bound coincides with the currently best iteration bound for the Ca...
An Exponential Tail Bound for Lq Stable Learning Rules. Application to k-Folds Cross-Validation
We consider a priori generalization bounds developed in terms of cross-validation estimates and the stability of learners. In particular, we first derive an exponential Efron-Stein type tail inequality for the concentration of a general function of n independent random variables. Next, under some reasonable notion of stability, we use this exponential tail bound to analyze the concentration of ...
Cross-Validation
Definition: Cross-validation is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model. In typical cross-validation, the training and validation sets must cross over in successive rounds such that each data point has a chance of being validated against. The basic form of ...
The lower bound for the number of 1-factors in generalized Petersen graphs
In this paper, we investigate the number of 1-factors of a generalized Petersen graph $P(N,k)$ and get a lower bound for the number of 1-factors of $P(N,k)$ as $k$ is odd, which shows that the number of 1-factors of $P(N,k)$ is exponential in this case and confirms a conjecture due to Lovász and Plummer (Ann. New York Acad. Sci. 576(2006), no. 1, 389-398).
Stability of cross-validation and minmax-optimal number of folds
In this paper, we analyze the properties of cross-validation from the perspective of stability, that is, the difference between the training error and the error of the selected model applied to any other finite sample. In both the i.i.d. and non-i.i.d. cases, we derive upper bounds on the one-round and average test error, referred to as the one-round/convoluted Rademacher-bounds, to qua...
Journal title:
- CoRR
Volume abs/1706.05801, Issue -
Pages -
Publication date 2017